DETAILED ACTION
Claim Objections 

1. Claims 16 to 19 are objected to because of the following informalities:

Claims 16 and 18 should be amended to better define the terms of the claims, by 
including the phrase "for states s and a set of observations T, and where yt T represents 
T observation frames of adaptation data". 

Appropriate correction is required. 

Claim Rejections - 35 USC § 102 

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

3. Claims 1, 7, 8, 14, and 15 are rejected under 35 U.S.C. 102(b) as being 
anticipated by Woodland et al. ("Iterative Unsupervised Adaptation Using Maximum 
Likelihood Linear Regression"). 

Regarding independent claims 1, 8, and 15, Woodland et al. discloses a method,
apparatus, and computer program for adaptation in speech recognition, comprising: 

"providing at least one speech recognition model" - gender independent Hidden 
Markov Models (HMMs) HMM-1 and HMM-2 are built from acoustic training data sets 
consisting of 36,493 sentences (Page 1134, Left Column, Paragraphs 4 to 7; Page
1135, Right Column, Paragraph 3);

"accepting speaker data" - test H3-P0 data was captured for each speaker of 20 
speakers (Page 1134, Right Column, Paragraph 7); 

"generating a word lattice having a plurality of paths based on the speaker data" 
- H3 development test data is used for lattice generation (Page 1135, Left Column,
Paragraphs 3 to 5: Table 1); word lattices are used to generate an error rate for H3-P0
data (Page 1136, Left Column, Lines 1 to 6: Table 3); implicitly, a word lattice has a
plurality of paths; 

"wherein the step of generating the word lattice comprises considering language 
model probabilities" - the HTK LVCSR system uses a decoder to produce word lattices 
containing language model information for the application for rescoring of new language 
models (Page 1134, Left Column, Paragraph 8); lattices generated by the HTK system
contain a set of nodes that correspond to particular instants and arcs connecting these 
nodes that represent hypotheses for the time period between the two nodes; associated 
with each arc are both language model and acoustic model scores; lattices may contain 
copies of each word, and further copies can be required to encode the language model 
constraints (Page 1135, Left Column, Paragraph 2); implicitly, language models
comprise a set of "language model probabilities" (Wikipedia); 
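
For illustration only, the lattice structure described above (nodes at particular instants, arcs carrying a word hypothesis with both an acoustic model score and a language model score) can be sketched as follows. This is a minimal sketch, not the HTK data format: the class name Arc, the functions paths and path_score, the parameter lm_scale, and all score values are hypothetical and do not appear in Woodland et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """One word hypothesis spanning the time between two lattice nodes."""
    start: int             # node id standing for the starting instant (assumed)
    end: int               # node id standing for the ending instant (assumed)
    word: str
    acoustic_score: float  # invented log-domain value
    lm_score: float        # invented log-domain value


def paths(arcs, node, final):
    """Yield every arc sequence leading from `node` to `final`."""
    if node == final:
        yield []
        return
    for arc in arcs:
        if arc.start == node:
            for rest in paths(arcs, arc.end, final):
                yield [arc] + rest


def path_score(path, lm_scale=1.0):
    """Combine acoustic and language model scores along a path."""
    return sum(a.acoustic_score + lm_scale * a.lm_score for a in path)


# Toy lattice: two competing words in each of two time periods.
arcs = [
    Arc(0, 1, "recognize", -12.0, -2.0),
    Arc(0, 1, "wreck", -11.0, -4.0),
    Arc(1, 2, "speech", -9.0, -1.5),
    Arc(1, 2, "beach", -9.5, -3.0),
]

all_paths = list(paths(arcs, 0, 2))
best = max(all_paths, key=path_score)
```

Even this two-period toy lattice yields four distinct paths, which is the sense in which a word lattice has a plurality of paths.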

"adapting at least one of the speaker data and the at least one speech 
recognition model with respect to the generated word lattice in a manner to maximize 
the likelihood of the speaker data" - language models were trained on the text training 
corpus and the H3 text data sets; HMM-1 models used global MLLR adaptation and
specific MLLR adaptation from word lattices for H3-P0 data; the result is a decreased 
error rate by adapting HMM-1 ("speech recognition model") to H3 data ("speaker data") 
using MLLR (Maximum Likelihood Linear Regression) (Page 1135, Right Column, 
Paragraph 5 to Page 1136, Right Column, Paragraph 2: Table 3). 

Regarding claims 7 and 14, Woodland et al. discloses maximum likelihood linear
regression (MLLR) for adaptation of speaker data in speech recognition (Page 1133).

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 2 to 6 and 9 to 13 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Woodland et al. in view of Nguyen et al. 

Concerning claims 2 and 9, Woodland et al. discloses generating word lattices, 
but omits generating word lattices by maximum a-posteriori adaptation. However, 
Nguyen et al. teaches adaptation by both Maximum Likelihood Linear Regression
(MLLR) and Maximum A Posteriori (MAP) adaptation, noting that both techniques are 
available to perform adaptation. It is stated that Bayesian-based MAP techniques are 
particularly useful in dealing with adaptation of sparse data sets, but in practical 
applications, depending upon the amount of adaptation data available, a combination of 
both MLLR and MAP may be used. (Column 1, Lines 50 to 60) Thus, Nguyen et al.
performs adaptation with both MLLR and MAP. (Column 3, Line 57 to Column 4, Line
32) It would have been obvious to one having ordinary skill in the art to generate a
word lattice with maximum a posteriori adaptation as taught by Nguyen et al. in MLLR
adaptation with word lattices of Woodland et al. for the purpose of dealing with
adaptation of sparse data sets in H3 training data. 

Concerning claims 3 and 10, Nguyen et al. discloses Bayesian adaptation by
MAP with Equation 4; $\gamma$ is the observed posterior probability of the observation to adapt
the speech models ("posterior state occupancy probability"); $\mu_{MAP}$ is found by summing
the observed posterior probabilities over time: $\sum_t \gamma(t)\, o_t$ and $\sum_t \gamma(t)$ ("posterior word
occupancy probabilities by summing over all states interior to a word") (column 4, lines 
10 to 23); the adaptation system then processes the segments in an N-best pass to 
collect the most probable labels; model adaptation may be performed to adapt speech 
models to words ("at least one likely word at each frame") (column 3, lines 1 to 8; 
column 3, lines 47 to 56). 
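
For illustration only, the two sums described above (posterior-weighted observations and the posteriors themselves, accumulated over time) are the usual ingredients of a MAP estimate of a Gaussian mean. The sketch below uses the common textbook form of that update. It is an assumption that this matches Equation 4 of Nguyen et al., which is not reproduced in this action; the function name map_mean, the prior weight tau, and the sample values are hypothetical.

```python
def map_mean(prior_mean, tau, gammas, observations):
    """Textbook MAP update of a scalar Gaussian mean (assumed form).

    prior_mean   -- mean of the unadapted model
    tau          -- weight given to the prior (hypothetical parameter)
    gammas       -- posterior occupancy gamma(t) for each frame t
    observations -- observation o_t for each frame t
    """
    # First accumulator: sum over time of gamma(t) * o_t.
    weighted_sum = sum(g * o for g, o in zip(gammas, observations))
    # Second accumulator: sum over time of gamma(t).
    occupancy = sum(gammas)
    return (tau * prior_mean + weighted_sum) / (tau + occupancy)


# With little adaptation data the estimate stays near the prior mean;
# with more data it moves toward the posterior-weighted data average.
adapted = map_mean(0.0, 2.0, [1.0, 0.5, 0.5], [4.0, 2.0, 6.0])
unadapted = map_mean(1.5, 2.0, [], [])
```

The interpolation between prior and data is why MAP techniques are described as useful for sparse adaptation data: with no frames at all, the update simply returns the prior mean.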

Concerning claims 4 and 11, Woodland et al. discloses word lattices (Page 1135,
Left Column); a word lattice implicitly contains word traces. 

Concerning claims 5 and 12, Woodland et al. discloses pruning during adaptation
(Paragraph Bridging Pages 1135 to 1136), but does not expressly discard
interpretations associated with low confidence. However, Nguyen et al. teaches
assigning weights to the N-best transcriptions, so that reliable information becomes 
enhanced by a positive weight, and unreliable information is correspondingly diminished 
by a negative weight. The system thus tends to push models that generate incorrect
labels away from those that generate correct ones. Subsequently, model information is 
accumulated among the N-best transcriptions for the entire set of sentences and then 
used to adapt the speech models. (Column 3, Lines 32 to 56; Column 4, Lines 23 to 
59) Taking the N-best of the most reliable transcriptions necessarily implies eliminating 
transcriptions not associated with the N-best most reliable transcriptions ("discarding 
interpretations associated with low confidence"). N-best techniques are well known in 
speech recognition. Nguyen et al. says assigning weights to the N-best transcriptions
corresponding to their likelihoods produces a natural information and data corrective 
process. (Column 3, Lines 31 to 34) It would have been obvious to one having ordinary 
skill in the art to utilize the N-best technique of Nguyen et al. to discard unreliable
transcriptions for pruning in MLLR adaptation with word lattices of Woodland et al. for 
the purpose of producing a natural information corrective process. 

Concerning claims 6 and 13, Nguyen et al. discloses Bayesian adaptation by 
MAP with Equation 4; $\gamma$ is the observed posterior probability of the observation to adapt
the speech models ("posterior phone probability") (column 4, lines 10 to 23); the 
observations and labels represent phonemes in speech recognition. 

Allowable Subject Matter 

6. Claims 16 to 19 are objected to as being dependent upon a rejected base claim, 
but would be allowable if rewritten in independent form including all of the limitations of 
the base claim and any intervening claims. 



Response to Arguments

7. Applicants' arguments filed 28 February 2006 have been fully considered but
they are not persuasive. 

Applicants argue that Woodland et al. does not generate a word lattice by 
considering language model probabilities. Applicants maintain that Woodland et al.'s
model scores may be broadly related to model probabilities, but are distinct subjects. 
Applicants state that a language model is a probability distribution, but that a language 
model score can result from several computations, either as a sum of all the 
probabilities of a subtree of a node or by a Viterbi algorithm. This position is not 
persuasive. 

It is appreciated that Applicants' distinction between language model probabilities 
and language model scores has merit, but the distinction does not overcome the 
rejection. A language model score represents a result for a particular set of 
observations given a current language model, as stated by Applicants. Strictly 
speaking, a language model score is different from the set of probabilities comprising a 
language model, insofar as a language model score is only one of a subset of 
probabilities from the language model for a given set of observations. However, 
language model probabilities are implicit characteristics of a language model. Wikipedia 
provides a definition of a "language model" by saying "statistical language models are 
probability distributions defined on sequences of words. . . ." Thus, in the sense
stipulated by Applicants, every language model consists in a set of probability
distributions, so language model probabilities are inherent elements of a language
model. For example, a bigram language model consists of a set of probabilities that a
word $w_1$ is followed by a word $w_2$, for every pair of words in a set of words $w_i \in W$. Still,
Woodland et al. discloses language models and language model scores, so language 
model probabilities are necessarily an inherent element of the language models 
disclosed by Woodland et al., as every language model is defined as a set of language
model probabilities. Thus, Woodland et al. anticipates the claimed limitation of 
generating a word lattice by considering language model probabilities. 
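
For illustration only, the relationship discussed above between the probabilities that make up a bigram language model and a score computed for a particular word sequence can be sketched as follows. This is a minimal sketch under stated assumptions: the toy corpus, the relative-frequency estimate without smoothing, and the function names train_bigram and score are hypothetical, and are not drawn from Woodland et al. or from the application.

```python
from collections import Counter
import math


def train_bigram(sentences):
    """Estimate P(w2 | w1) by relative frequency from tokenized sentences."""
    pair_counts = Counter()
    history_counts = Counter()
    for words in sentences:
        for w1, w2 in zip(words, words[1:]):
            pair_counts[(w1, w2)] += 1
            history_counts[w1] += 1
    # The model is nothing more than this table of conditional probabilities.
    return {pair: n / history_counts[pair[0]] for pair, n in pair_counts.items()}


def score(model, words):
    """Log-probability score of one word sequence; -inf if a bigram is unseen."""
    total = 0.0
    for pair in zip(words, words[1:]):
        p = model.get(pair, 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
model = train_bigram(corpus)
sequence_score = score(model, ["the", "cat", "sat"])
```

In this sketch the score for a sequence is derived entirely from entries of the probability table, so any computation of the score necessarily consults the model's probabilities.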

Therefore, the rejections of claims 1, 7, 8, 14, and 15 under 35 U.S.C. 102(b) as 
being anticipated by Woodland et al., and of claims 2 to 6 and 9 to 13 under 35 U.S.C.
103(a) as being unpatentable over Woodland et al. in view of Nguyen et al., are proper.

Conclusion 

8. The prior art made of record and not relied upon is considered pertinent to 
Applicants' disclosure. 

Padmanabhan et al., Bahl et al., and Ephraim disclose related art.

Wikipedia provides a definition of "language model". 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Martin Lerner whose telephone number is (571) 272-7608.
The examiner can normally be reached on 8:30 AM to 6:00 PM Monday to
Thursday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, David R. Hudspeth, can be reached on (571) 272-7843. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 